DOCUMENT MANAGING APPARATUS

Background of the Invention 
Field of the Invention 

The present invention relates to a document
managing apparatus that correlates and manages an
original electronic document generated by a word
processor, etc., a file obtained by extracting and
recognizing a note that a person has taken in a
distributed paper document printed from the electronic
document, and an image file of the note.

Description of the Related Art 

With the recent popularization of personal
computers, a document that was conventionally printed
on paper for use is now generated by a tool such as
a word processor, etc., and the original data of the
document is managed as an electronic document.

If an electronic document generated by a word
processor, etc. is printed on paper and distributed at
a meeting, etc., participants in the meeting often take
notes in the margin of the document. An existing
document managing apparatus can manage an electronic
document, but cannot handle a note. However, a note that
a participant in a meeting takes in the margin, etc.
of a paper document includes important information, etc.
about the discussion held at the meeting. Therefore, the
paper document cannot be discarded. Eventually, the
original electronic document and the paper document in
which the note is taken are doubly managed, which is
troublesome.

As described above, a document printed on paper
is distributed to persons who actually reference the
document, and matters expected to be important are
usually taken as notes in the paper document. Therefore,
information cannot be managed fully electronically.

Summary of the Invention 

An object of the present invention is to provide
a document managing apparatus that simultaneously manages
a note taken in a distributed paper document that is
printed from an electronic document generated by a word
processor, etc., and the original electronic document.

A document managing apparatus according to the
present invention is a document managing apparatus that
electronically manages a note taken in a paper document
printed from an electronic document. This apparatus
comprises a reading unit reading, as an image, a document
in which a note is taken, an extracting unit extracting
information about the note from the read image, and a
unit correlating and electronically storing the
electronic document and the information about the note.

Conventionally, information of a note taken in a
paper document that is printed from an electronic
document is stored by holding the paper document.
According to the present invention, however, a note is
electronically managed as information about the note, such
as raw image data, its recognition result, etc. This
eliminates the need for storing a paper document, so
that information can be efficiently managed. In particular,
an electronic document and information about a note are
correlated and stored, whereby a user can easily display
the note correlated to the electronic document and obtain
the information about the note at any time.

Brief Description of the Drawings 

Fig. 1 is a schematic diagram explaining the 
configuration and the operations of a document managing 
apparatus according to a preferred embodiment of the 
present invention; 

Fig. 2 is a flowchart showing a note 
extraction/recognition process; 

Fig. 3 explains the concept of a note region
extraction process;

Fig. 4 is a flowchart showing a note information
registration/correlation process;

Fig. 5 explains the format of a file stored in a
document information file;

Fig. 6 is a flowchart showing a document search
process;

Fig. 7 exemplifies a display of a document list
in the case where note data exists;

Fig. 8 exemplifies a display of an original
electronic document, a note recognition result, and a
note image;

Fig. 9 explains the hardware configuration of an
information processing device that is required when the
apparatus according to the preferred embodiment is
implemented by causing the information processing
device to execute a program; and

Fig. 10 explains a use pattern of a program (data).

Description of the Preferred Embodiments 

A document managing apparatus according to a
preferred embodiment of the present invention
comprises: a function registering an electronic
document; a paper document inputting function capturing,
as image data by using a scanner, an electronic camera,
etc., the electronic document that was distributed as
a paper document and in which a note is taken; a note
extracting function extracting only the note from the
image in which the note is taken; a note managing unit
recognizing the extracted note image portion, and putting
the recognition result into a file along with the
corresponding image; and a file managing unit correlating
and managing the original electronic document, the note
file, and the note image. This apparatus can
electronically manage a note and an electronic document
at the same time, which reduces the trouble of doubly
managing paper and electronic documents, and makes data
and information easier to reuse.

Fig. 1 is a schematic diagram explaining the
configuration and the operations of a document managing
apparatus according to a preferred embodiment of the
present invention.

A user interface unit 1 is configured by a keyboard,
a mouse, a display, etc., and handles interaction with
a user. An electronic document registering unit 2
registers an electronic document upon receipt of a user
request from the user interface unit 1, and generates
a document information file for holding information of
each document, such as a pointer in a memory, a document
name, an author name, a creation time, the number of
pages, etc. The document information file will be
described later.

A paper document inputting unit 4 is configured
by a scanner, etc., and captures a paper document as
an image when a user issues a process request via the
user interface unit 1. A note extracting unit 5 extracts
a note image from the paper document image based on
original electronic document data that a user specifies
via the user interface unit 1. A specific note extraction
process will be described later. A note recognizing unit
6 performs character recognition for the extracted note
image while referencing a character recognition
dictionary 13. Since the recognized characters may
include errors, the recognition result can also be
corrected at this time. The correction is made with an
existing technique.

A note recognition result file is registered with a
note registering unit 7. A file managing unit 3
correlates file information such as a note recognition
result file, a note image file, etc. to an original
electronic document automatically or according to a user
specification, and writes the correlated information to
a document information file. If the file information is
correlated to the original electronic document
automatically, electronic documents are searched based
on the information of a character string or a ruled line
of a paper document image, so that a corresponding
electronic document is found. For a search using a ruled
line or a character string, by way of example, the
technique disclosed in a pending application filed by
the present applicant, or the technique disclosed in
Japanese Patent Publication No. 10-240958 is used.

When a user issues a document search request via
the user interface unit 1, a document searching unit
8 interprets the user request, and requests the file
managing unit 3 to search for a document. The file
managing unit 3 accesses a document information file
9, and searches for a corresponding file. If the user
issues a request to search all documents for a word via
the user interface unit 1, the file managing unit 3
accesses a file within the document information file 9,
an original document file 10, and a note recognition
result file 11 to make the word search. Furthermore, an
original electronic document, a note recognition result,
a note image, etc. are displayed according to a user
request. In this case, the note image is read from a note
image file 12 based on the information of the document
information file 9.

In addition, a function for calculating attribute
information such as the location, the size, etc. of a
note, and searching for a digitized paper document
by using the attribute information of the note may be
provided as a function for managing a note in a paper
document. Furthermore, the file managing unit 3 has a
function for managing and displaying an original
electronic document, a digitized paper document,
and information on the presence of a note, which are
correlated to one another, and also has a function for
obtaining a desired document from the above described
documents as needed.

Fig. 2 is a flowchart showing a note 
extraction/recognition process. 

The note extracting unit checks a document image
with a note, which a user inputs with a scanner, for
skew, and corrects the image to be upright if it is
skewed. Furthermore, the note extracting unit makes a
comparison between the document image with the note and
an image of the corresponding original electronic
document, and removes a preprint portion (characters,
etc. which are included in the electronic document and
printed on paper) from the document image with the note.
Specifically, a document image is generated from the
original electronic document so that the generated image
and the document image with the note become equal in
size, and the preprint portion is removed with an
existing technique such as overlaying the generated
image on the document image with the note. The remaining
portion is then extracted with the techniques described
in the following documents, etc.
N. Babaguchi, M. Tsukamoto, and H. Aihara,
"Fundamental Consideration of Character Extraction
from a Handwritten Japanese Character String", IEICE
Transactions Vol. J68-D, No. 12, pp. 2123-2131, December '85

S. Fujii and K. Omori, "Handwritten Character
String Recognition System Using a Character Extraction
Process Based on a Contact Pattern of Characters -
Development of a Character Code String Generator",
Meeting on Image Recognition and Understanding (MIRU
'94), July 1994, pp. I-123-I-130

The note recognizing unit performs character
recognition, by using a character recognition
dictionary, for a note image obtained by the character
extraction.

The flow of the process is explained with
reference to Fig. 2.

Firstly, in step S1, skew of a document image
with a note is corrected. In step S2, an image is
generated from an original electronic document. At this
time, an electronic document to be read is identified
by referencing a document information file, and the
identified electronic document is read from an
electronic document file. Then, in step S3, a preprint
is removed, for example, by overlaying the document
image with the note and the image generated from the
original electronic document. In step S4, characters
are extracted from the image with the note, from which
the preprint has been removed. In step S5, character
recognition is performed for the extracted characters.
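
As a rough illustration of the preprint removal in step S3, the following Python sketch overlays two binary images of equal size and keeps only the ink that appears in the scanned image. It assumes that skew correction (step S1) and image generation (step S2) are already done and that the two images align pixel for pixel. The image representation (rows of 0/1 values) and the name `remove_preprint` are hypothetical and are not taken from the embodiment.

```python
from typing import List

Image = List[List[int]]  # assumed representation: 1 = ink, 0 = background


def remove_preprint(scanned: Image, rendered: Image) -> Image:
    """Return the ink found in the scan but not in the rendered original."""
    same_height = len(scanned) == len(rendered)
    same_width = all(len(s) == len(r) for s, r in zip(scanned, rendered))
    if not (same_height and same_width):
        raise ValueError("images must be equal in size before overlaying")
    return [
        [1 if s == 1 and r == 0 else 0 for s, r in zip(s_row, r_row)]
        for s_row, r_row in zip(scanned, rendered)
    ]


scanned = [[1, 1, 0, 0], [0, 0, 1, 1]]   # preprint plus a handwritten note
rendered = [[1, 1, 0, 0], [0, 0, 0, 0]]  # preprint only
note_only = remove_preprint(scanned, rendered)
```

A real scan rarely aligns exactly with the generated image, so a practical implementation would probably need to tolerate small offsets; that is outside the scope of this sketch.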

Fig. 3 explains the concept of a note region 
extraction process. 

Fig. 3A shows image data of a document in which
a note is taken. This image data is generated
by capturing, with a scanner, an electronic camera, etc.,
an electronic document that was printed on paper and
distributed, and in which the note is taken. Fig. 3B shows
document image data generated from the electronic
document. The difference between these image data is
that the note is included only in the image data
shown in Fig. 3A. If the image data of Figs. 3A and 3B
are overlaid, preprints such as characters included in
the electronic document, etc. should overlap. This is
because the portion other than the note in Fig. 3A is
printed from the electronic document. When the images
are overlaid, a differential image, from which
overlapping characters are removed, is obtained as shown
in Fig. 3C. By extracting the remaining image such as
characters, etc. in the differential image, the note
region is extracted as shown in Fig. 3D.
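
One simple way to picture the step from the differential image to the note region is to take the bounding box of the remaining ink. The sketch below does this for a binary differential image. It is only an illustration under the assumption that a single rectangular region is wanted; the function name `note_region` and the (top, left, bottom, right) convention are hypothetical.

```python
from typing import List, Optional, Tuple


def note_region(diff: List[List[int]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the remaining ink, or None."""
    # Rows that still contain at least one ink pixel after the overlay.
    rows = [y for y, row in enumerate(diff) if any(row)]
    if not rows:
        return None  # nothing remained, so no note was taken
    # Column indices of every remaining ink pixel.
    cols = [x for row in diff for x, value in enumerate(row) if value]
    return rows[0], min(cols), rows[-1], max(cols)


differential = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
region = note_region(differential)
```

If a page carries several separate notes, one bounding box per connected group of pixels would be needed instead; this sketch does not attempt that.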

Fig. 4 is a flowchart showing a note information
registration/correlation process.

If a user requests a document management
registration, a document registration menu is made
visible on a display. The user selects an electronic
document registration from the menu via a keyboard, a
mouse, etc., places an electronic document file desired
to be registered in a specified directory, and inputs
the name of the electronic document file desired to be
registered via the keyboard being a user interface. The
electronic document registering unit then extracts the
title, version number, protection information, document
type, etc. from the document file, and writes the
extracted information to a document information file.
Additionally, the user selects a note registration from
the document registration menu via the user interface
such as a keyboard, a mouse, etc., and inputs or selects
from a list the file name of the original electronic
document in which a note is taken. Furthermore, the user
inputs the paper document in which the note is taken
with a scanner, an electronic camera, etc.

The note extracting unit references the document
information file, reads the electronic document file
registered to the location corresponding to the file
name input by the user, and copies the electronic
document file to a working area. Furthermore, the note
extracting unit extracts only the note portion by making
a comparison between the document image with the note,
which is input from a scanner, an electronic camera,
etc., and the original electronic document file. The
note recognizing unit performs character recognition
for the extracted note portion. Furthermore, the note
registering unit stores a recognition result unchanged
in a predetermined location if there is no error, or
stores a corrected recognition result in a predetermined
location if there is an error. The note registering unit
also stores the note image in a predetermined location.
The number of notes, a note recognition result, a pointer
to a note image, and location information of an image
with a note are written to the entry of the corresponding
original electronic file in the document information
file.

The above described process is explained with 
reference to the flowchart shown in Fig. 4. 

Firstly, in step S10, a document registration menu
is displayed for a user. In step S11, an electronic
document to be correlated is registered to a document
information file. In step S12, the electronic document
specified to be correlated is read from an electronic
document file by referencing the document information
file. In step S13, a paper document with a note, which
corresponds to the specified electronic document, is
input from a scanner, an electronic camera, etc. Then,
in step S14, a note extraction/recognition process is
performed. In step S15, a note recognition result is
displayed for the user. In step S16, the user corrects
the note recognition result if necessary. In step S17,
the note recognition result and the note image are
respectively stored in a note recognition result file
and a note image file, and at the same time, the
corresponding information is written to the document
information file.

Here, the explanation is provided based on the
assumption that the note recognition process runs
properly. In practice, however, the recognition process
cannot run properly, for example, if a note does not
consist of characters. Accordingly, whether or not to
perform the note recognition process may be determined
by a user specification. If the recognition process is
not performed, the note information correlated to an
electronic document is only note image data.

Fig. 5 explains the format of a file stored in a
document information file.

If a user issues a request, the document managing
unit reads and displays an electronic document, a
corresponding note recognition result, and data of a
note image by referencing the document information file
shown in Fig. 1. The document information file stores
the number of notes, an array of pointers to note
recognition result files, an array of pointers to note
image files, location information of each note in a paper
document, etc. in addition to a file name, a document
title, file size information such as the number of pages,
the number of columns, a data size, etc., protection
information (write protection, etc.), a registration
date and time, a document type, and a pointer to
an electronic document file, which indicates a data
location in a memory.

Here, the location information of each note in a
paper document indicates in which portion of a document
a note exists. For example, when an electronic document
is displayed on a screen of a word processor, a line
or a column number, which approximately indicates the
location in which a note is taken, may be used,
or the value (centimeters or inches) of a ruler scale
of the word processor may be used for the location
of a note if the word processor manages the location
of a character on paper in units such as centimeters,
inches, etc.
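
The items listed for the document information file can be pictured as one record per electronic document. The following Python sketch is one possible in-memory layout, not the format of Fig. 5 itself: the class name, field names, field types, and the use of file paths as "pointers" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DocumentInfo:
    """Hypothetical entry of the document information file."""
    file_name: str
    title: str
    pages: int
    columns: int
    data_size: int
    protection: str          # e.g. "write-protected"
    registered_at: str       # registration date and time
    doc_type: str
    document_pointer: str    # assumed: path to the electronic document file
    note_result_pointers: List[Optional[str]] = field(default_factory=list)
    note_image_pointers: List[str] = field(default_factory=list)
    note_locations: List[str] = field(default_factory=list)

    @property
    def note_count(self) -> int:
        # Every note has an image, even when recognition was skipped.
        return len(self.note_image_pointers)

    def add_note(self, image_ptr: str, location: str,
                 result_ptr: Optional[str] = None) -> None:
        """Correlate one note (image, location, optional result) to the entry."""
        self.note_image_pointers.append(image_ptr)
        self.note_locations.append(location)
        self.note_result_pointers.append(result_ptr)


entry = DocumentInfo("meeting1.doc", "Meeting material 1", 3, 1, 20480,
                     "write-protected", "1999-01-01 10:00", "text",
                     "docs/meeting1.doc")
entry.add_note("notes/meeting1_n1.png", "line 12", "notes/meeting1_n1.txt")
entry.add_note("notes/meeting1_n2.png", "line 30")  # recognition skipped
```

Keeping the recognition result optional reflects the case described above in which only note image data is correlated to the electronic document.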

Furthermore, in the preferred embodiment of the
present invention, a character recognition process is
performed for a note, and a recognition result is stored
as character code. Therefore, not only an electronic
document but also a note recognition result can be used
as a search target when a document search is made.

Fig. 6 is a flowchart showing a document search
process.

When a user issues a document search request and
inputs a word that he or she desires to search for via a
user interface such as a keyboard, a mouse, etc., the
document searching unit references a document
information file, searches for the character code
corresponding to the requested word in each electronic
document data and a note recognition result correlated
by the document information file, and makes the result
visible on a display.

Namely, if a user specifies a word to be searched for
in step S20, a document information file is referenced
in step S21, and the character code of the specified
word is searched for in each electronic document file and
its correlated note recognition result in step S22. At
this time, the character codes of words within the
electronic document are also searched. Then, in step S23,
the electronic document and the note recognition result,
which are found as a result of the search, are displayed.
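
The search of steps S20 to S23 can be sketched as a loop over the registered documents that checks both the document text and its correlated note recognition results. In the Python sketch below, the dictionaries standing in for the document information file, the function name `search_word`, and the "document"/"note" labels are hypothetical; a plain substring match stands in for the character code comparison.

```python
from typing import Dict, List, Tuple


def search_word(word: str,
                documents: Dict[str, str],
                note_results: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Return (file name, where found) pairs for every hit of `word`."""
    hits: List[Tuple[str, str]] = []
    for name, text in documents.items():          # S21: walk the registry
        if word in text:                          # S22: search the document
            hits.append((name, "document"))
        # S22: search the note recognition results correlated to it
        if any(word in note for note in note_results.get(name, [])):
            hits.append((name, "note"))
    return hits                                   # S23: results to display


documents = {
    "report1.doc": "budget plan for next year",
    "meeting2.doc": "agenda and schedule",
}
note_results = {"meeting2.doc": ["check budget with sales"]}
found = search_word("budget", documents, note_results)
```

Here the second document is found only through its note, which is the case that a search limited to electronic documents would miss.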

Fig. 7 exemplifies a display of a document list
in the case where note data exists. For a document with
note data, for example, an indication "note" is attached
to the beginning of the document title. In the example
shown in Fig. 7, the indication "note" is attached to
a study result report 1, so that a user is notified of
the presence of note data in addition to the electronic
document. Furthermore, it is indicated that note data
exists for a meeting material 1 but not for a meeting
material 2. The other materials are similar.

As described above, when an electronic document
data list is displayed, whether or not note data
correlated to an electronic document exists is
indicated.
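
The list display described above amounts to prefixing a title with an indication when its note count is greater than zero. A minimal Python sketch follows; the exact form of the indication ("note " followed by the title) and the function name `format_document_list` are assumptions, since the layout of Fig. 7 is not reproduced here.

```python
from typing import List, Tuple


def format_document_list(entries: List[Tuple[str, int]]) -> List[str]:
    """Prefix each title with "note" when note data exists for it."""
    return [
        f"note {title}" if note_count > 0 else title
        for title, note_count in entries
    ]


listing = format_document_list([
    ("Study result report 1", 2),
    ("Meeting material 1", 1),
    ("Meeting material 2", 0),
])
```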

Figs. 8A and 8B exemplify a display of an original
electronic document, a note recognition result, and a
note image.

A user selects a menu in a toolbar in a window with
a mouse, a keyboard, etc. as needed, so that
a display of a note or a note image is toggled on and
off (see Fig. 8B). For example, a note is inserted and
displayed, in a different color, in the line
corresponding to the location in which the note is
taken, whereas a note image is displayed in another
window. Fig. 8A shows an example where a display of a
note and a note image is toggled on.

Fig. 9 explains the hardware environment of an 
information processing device that is required when an 
apparatus according to the preferred embodiment is 
implemented by causing the information processing 
device to execute a program. 

A CPU 21 is connected to an external storage device
25 such as a hard disk, or a medium driving device 26
via a bus 28. The medium driving device 26 reads data
of a program, etc. from a portable storage medium 29
such as a floppy disk, a CD-ROM, a DVD, etc. The program
is read from the external storage device 25 or the
portable storage medium 29, copied to a memory 22, and
executed by the CPU 21. An input device 23 is configured
by a keyboard, a mouse, a display, a scanner, an
electronic camera, etc., and used to notify the CPU 21
of a user instruction, or to read a paper document with
a note as an image. In the external storage device 25
or on the portable storage medium 29, a paper document
with a note, an original electronic document, etc. are
stored. In particular, the document information file 9,
the original document file 10, the note recognition
result file 11, the note image file 12, the character
recognition dictionary 13, etc., which are shown in Fig.
1, are configured there.

An output device 24 is configured by a display,
etc., and makes a display as shown in Fig. 7 or 8. This
device configures a user interface along with the input
device 23, for example, by providing a user with
necessary information, or displaying a screen that
prompts a user to make an input.

A network connecting device 27 is a device for
connecting the information processing device to a
network. This device is used to download the program
via a network, or to access the above described files
via a network if the files are stored in separate
locations.

Fig. 10 explains a use pattern of a program (data).

An information processing device 31 can store a
program in a memory 32 such as a RAM, a hard disk, etc.,
and can execute the program. Alternatively, the
information processing device 31 may execute the program
by loading it from a storage medium 34 such as a CD-ROM,
a floppy disk, etc.

Furthermore, the information processing device 31
can access a program (data) provider 30, use a program
and data by downloading them, or use the program and
data under a network environment.

According to the present invention, a note
taken in a paper document printed from an electronic
document can also be managed as electronic data, whereby
information can be electronically managed in a unified
manner without storing a paper medium in which a note
is taken.



